Diamonds

Quinty Boer

Diamonds are shiny …

Diamond image from Unsplash (Dan 2022)

… and sparkly

Diamond gif from Tenor (AnitaCruz2324 2025)

Let’s look at the diamonds dataset from ggplot2

Histogram of diamond carat

Size matters!

A histogram of diamond weight (carat), using 10 bins.

A histogram of diamond weight (carat), using 100 bins.
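
The two histograms can be reproduced roughly as follows. This is a minimal sketch, assuming the plots use the full `diamonds` data and default styling; the object names `p10` and `p100` and the axis labels are illustrative choices, not taken from the slides.

```r
library(ggplot2)

# Coarse view: 10 bins hide most of the structure in carat
p10 <- ggplot(diamonds, aes(x = carat)) +
  geom_histogram(bins = 10) +
  labs(x = "Carat", y = "Count")

# Fine view: 100 bins show more detail in the distribution
p100 <- ggplot(diamonds, aes(x = carat)) +
  geom_histogram(bins = 100) +
  labs(x = "Carat", y = "Count")

print(p10)
print(p100)
```

Only the `bins` argument differs between the two plots, so any difference in appearance comes from the binning alone.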

How about a linear regression?


The regression equation:


\(Y_{\text{price}} = \beta_0 + \beta_1 X_{\text{carat}} + \beta_2 X_{\text{cut}} + \beta_3 X_{\text{depth}} + \beta_4 X_{\text{x}} + \beta_5 X_{\text{y}} + \beta_6 X_{\text{z}}\)
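
One detail the equation glosses over: `cut` is an ordered factor with five levels, so R does not fit a single \(\beta_2\) for it. With R's default contrasts, an ordered factor is expanded into polynomial contrast columns. The sketch below inspects the design matrix for the same formula; the variable name `X` is just for illustration.

```r
library(ggplot2)

# cut is stored as an ordered factor
print(levels(diamonds$cut))
print(is.ordered(diamonds$cut))

# Build the design matrix R would use for this set of predictors
X <- model.matrix(~ carat + cut + depth + x + y + z, data = diamonds)

# Column names show how cut is split into several contrast terms
print(colnames(X))
```

The single \(\beta_2 X_{\text{cut}}\) term in the equation is therefore shorthand for several coefficients in the fitted model.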

R code for the regression


# load diamonds dataset
library(ggplot2)
data(diamonds)

# randomly sample 5000 diamonds
set.seed(42)
diamonds_sample <- diamonds[sample(nrow(diamonds), 5000),
                            c("carat", "cut", "depth", "x",
                              "y", "z", "price")]

# fit a linear regression model
model <- lm(price ~ carat + cut + depth + x + y + z,
            data = diamonds_sample)
summary(model)
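
`summary(model)` prints a lot at once. To pull out individual pieces, something like the following may help. It is a sketch that refits the same model so the block runs on its own; `coefs` and `r2` are illustrative names.

```r
library(ggplot2)

# Same sample and model as above, repeated so this block is self-contained
set.seed(42)
diamonds_sample <- diamonds[sample(nrow(diamonds), 5000),
                            c("carat", "cut", "depth", "x",
                              "y", "z", "price")]
model <- lm(price ~ carat + cut + depth + x + y + z,
            data = diamonds_sample)

# Named vector of estimated coefficients
coefs <- coef(model)
print(round(coefs, 2))

# Proportion of variance in price explained by the model
r2 <- summary(model)$r.squared
print(r2)
```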

The dataset used in this analysis is included in the ggplot2 package (Wickham 2016).

References

AnitaCruz2324. 2025. Diamond Sticker. Tenor. https://tenor.com/en-GB/view/diamond-gif-5451697367070963787.
Dan, Daniel. 2022. Diamonds Closeup Jewelry Gem. Unsplash. https://unsplash.com/photos/a-pair-of-diamond-shaped-rings-gUkgk1GIGdQ.
Wickham, Hadley. 2016. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. https://ggplot2.tidyverse.org.